Fuzzy K-Means Clustering on a High Dimensional Semantic Space

Authors

  • Guihong Cao
  • Dawei Song
  • Peter Bruza
Abstract

One way of representing semantics is via a high dimensional conceptual space constructed by lexical semantic space models. Concepts (words), each represented as a vector of other words in the semantic space, can be categorized via clustering techniques into a number of regions reflecting different contexts. Conventional clustering algorithms, e.g., the K-means method, however, normally produce crisp clusters, i.e., an object is assigned to only one cluster. This is sometimes not the case in reality. A fuzzy membership function, which models the degree to which an object belongs to a certain cluster, can therefore be applied to K-means clustering. This paper introduces a fuzzy k-means clustering algorithm and shows how it is applied to word clustering on the high dimensional semantic space constructed by a cognitively motivated semantic space model, namely Hyperspace Analogue to Language. A case study demonstrates that the method is promising.
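The abstract does not give the algorithm itself, so the following is only a minimal sketch of the standard fuzzy c-means update rules, which is one common reading of "fuzzy k-means"; it is not taken from the paper. The function name `fuzzy_k_means`, the fuzzifier `m`, the stopping rule, and the toy word vectors are all assumptions made for illustration, and the vectors are not real Hyperspace Analogue to Language output.

```python
import math
import random


def fuzzy_k_means(points, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Fuzzy k-means (fuzzy c-means) on a list of equal-length vectors.

    Returns (centers, memberships), where memberships[i][j] is the degree
    to which points[i] belongs to cluster j; each row sums to 1.
    The fuzzifier m > 1 is an assumed parameter, not a value from the paper.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centers: mean of the points weighted by membership ** m.
        centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])

        # Memberships: larger for centers that are relatively closer.
        new_u = []
        for p in points:
            dist = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dist):
                # Point coincides with a center: all membership goes there.
                zeros = [1.0 if d == 0.0 else 0.0 for d in dist]
                s = sum(zeros)
                new_u.append([z / s for z in zeros])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                s = sum(inv)
                new_u.append([v / s for v in inv])

        # Stop once no membership changes by more than tol.
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(k))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Hypothetical toy vectors standing in for word co-occurrence rows.
words = ["bank", "money", "loan", "river", "shore", "water"]
vectors = [[9.0, 8.0], [10.0, 9.0], [9.0, 10.0],
           [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
centers, memberships = fuzzy_k_means(vectors, k=2)
for word, row in zip(words, memberships):
    print(word, [round(v, 3) for v in row])
```

Unlike crisp K-means, every word receives a membership degree for every cluster, so a word used in several contexts can sit partly in more than one region rather than being forced into a single one.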

Similar resources

Concept Induction via Fuzzy C-Means Clustering in a High Dimensional Semantic Space

Lexical semantic space models have recently been investigated to automatically derive the meaning (semantics) of information based on natural language usage. In a semantic space, a term can be considered as a concept represented geometrically as a vector, the components of which correspond to terms in a vocabulary. A primary way to perform reasoning in a semantic space is to categorize concepts...

Discourse Type Clustering using POS n-gram Profiles and High-Dimensional Embeddings

To cluster textual sequence types (discourse types/modes) in French texts, K-means algorithm with high-dimensional embeddings and fuzzy clustering algorithm were applied on clauses whose POS (part-of-speech) n-gram profiles were previously extracted. Uni-, bi- and trigrams were used on four 19th century French short stories by Maupassant. For high-dimensional embeddings, power transformations on t...

FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and apply it to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Integrating Fuzzy C-Means Clustering Technique with K-Means Clustering Technique for CBIR

Image database sizes have increased enormously in recent years due to the development of technology, which has created the need for Content Based Image Retrieval (CBIR) systems. In this study, a CBIR system that searches and retrieves images from databases is developed using the fuzzy c-means algorithm and K-means clustering; the system uses low level features like color,...

Publication date: 2004